Overview

Dataset statistics

Number of variables: 24
Number of observations: 3799
Missing cells: 7136
Missing cells (%): 7.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 MiB
Average record size in memory: 631.6 B
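
The missing-cell percentage follows from the raw counts in this table. A minimal sketch of the arithmetic, using only the figures reported here (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics in this section.
n_variables = 24
n_observations = 3799
missing_cells = 7136

# The total number of cells is rows times columns.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 91176
print(round(missing_pct, 1))  # 7.8
```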

Variable types

Numeric: 11
Categorical: 13

Alerts

society has a high cardinality: 676 distinct values [High cardinality]
sector has a high cardinality: 113 distinct values [High cardinality]
areaWithType has a high cardinality: 2355 distinct values [High cardinality]
price is highly correlated with property_type and 3 other fields [High correlation]
price_per_sqft is highly correlated with price [High correlation]
area is highly correlated with built_up_area and 1 other field [High correlation]
bedRoom is highly correlated with property_type and 5 other fields [High correlation]
bathroom is highly correlated with property_type and 5 other fields [High correlation]
super_built_up_area is highly correlated with price and 3 other fields [High correlation]
built_up_area is highly correlated with area [High correlation]
carpet_area is highly correlated with area [High correlation]
servant room is highly correlated with bedRoom and 2 other fields [High correlation]
property_type is highly correlated with price and 4 other fields [High correlation]
balcony is highly correlated with bedRoom and 1 other field [High correlation]
floorNum is highly correlated with property_type [High correlation]
agePossession is highly correlated with property_type [High correlation]
facing has 1104 (29.1%) missing values [Missing]
super_built_up_area has 1886 (49.6%) missing values [Missing]
built_up_area has 2067 (54.4%) missing values [Missing]
carpet_area has 1858 (48.9%) missing values [Missing]
luxury_score has 147 (3.9%) missing values [Missing]
area is highly skewed (γ1 = 30.216827) [Skewed]
built_up_area is highly skewed (γ1 = 41.20580346) [Skewed]
carpet_area is highly skewed (γ1 = 24.77697875) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
floorNum has 134 (3.5%) zeros [Zeros]
luxury_score has 457 (12.0%) zeros [Zeros]
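
Alerts of this kind can be thought of as threshold checks on each variable's summary. The sketch below is an illustration only: the three threshold constants and the `alerts_for` helper are assumptions made for this example (the report does not state the thresholds it used), chosen so that they agree with the alerts listed above for the variables shown.

```python
# Hypothetical thresholds; the report does not state the values it used.
CARDINALITY_THRESHOLD = 50     # assumed: flag categoricals with more distinct values
MISSING_PCT_THRESHOLD = 1.0    # assumed: flag variables with more missing values (%)
SKEWNESS_THRESHOLD = 20.0      # assumed: flag numerics with larger |skewness|


def alerts_for(summary):
    """Return the alert labels that apply to one variable's summary dict."""
    alerts = []
    if summary["type"] == "Categorical" and summary["distinct"] > CARDINALITY_THRESHOLD:
        alerts.append("High cardinality")
    if summary["missing_pct"] > MISSING_PCT_THRESHOLD:
        alerts.append("Missing")
    skew = summary.get("skewness")
    if skew is not None and abs(skew) > SKEWNESS_THRESHOLD:
        alerts.append("Skewed")
    return alerts


# Summary figures copied from the report.
summaries = {
    "society": {"type": "Categorical", "distinct": 676, "missing_pct": 100 * 1 / 3799},
    "facing": {"type": "Categorical", "distinct": 8, "missing_pct": 29.1},
    "price": {"type": "Numeric", "distinct": 473, "missing_pct": 0.5, "skewness": 3.309621269},
    "area": {"type": "Numeric", "distinct": 1312, "missing_pct": 0.5, "skewness": 30.216827},
    "built_up_area": {"type": "Numeric", "distinct": 644, "missing_pct": 54.4, "skewness": 41.20580346},
}

for name, summary in summaries.items():
    print(name, alerts_for(summary))
```

With these assumed thresholds, price raises no alert even though it is right-skewed, because its skewness of about 3.3 is well below the cut-off.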

Reproduction

Analysis started: 2023-09-12 19:23:41.771088
Analysis finished: 2023-09-12 19:24:48.174621
Duration: 1 minute and 6.4 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 3799
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1900.046065
Minimum: 0
Maximum: 3802
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 189.9
Q1: 949.5
Median: 1899
Q3: 2851.5
95-th percentile: 3612.1
Maximum: 3802
Range: 3802
Interquartile range (IQR): 1902

Descriptive statistics

Standard deviation: 1098.081624
Coefficient of variation (CV): 0.5779236853
Kurtosis: -1.199926791
Mean: 1900.046065
Median Absolute Deviation (MAD): 951
Skewness: 0.001656186977
Sum: 7218275
Variance: 1205783.253
Monotonicity: Strictly increasing
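
These figures are what one would expect from a near-uniform index: skewness close to 0 and excess kurtosis close to -1.2. The sketch below checks this on an idealised stand-in (consecutive integers 0..3798, an assumption, since the real df_index has a few gaps) using plain population moments. The report's estimator may apply small-sample corrections, so the last digits will differ. It also recomputes the coefficient of variation from the reported standard deviation and mean.

```python
import math


def population_moments(values):
    """Return mean, standard deviation, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return mean, math.sqrt(m2), skewness, excess_kurtosis


# Idealised stand-in for df_index: one consecutive integer per row.
index = list(range(3799))
mean, std, skew, kurt = population_moments(index)
print(mean, std, skew, kurt)  # skew is 0 and kurt is close to -1.2

# Coefficient of variation from the reported figures: std / mean.
reported_std = 1098.081624
reported_mean = 1900.046065
cv = reported_std / reported_mean
print(round(cv, 6))  # 0.577924
```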
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
0                    1      < 0.1%
3327                 1      < 0.1%
3303                 1      < 0.1%
1258                 1      < 0.1%
3307                 1      < 0.1%
1262                 1      < 0.1%
3311                 1      < 0.1%
1266                 1      < 0.1%
3315                 1      < 0.1%
1270                 1      < 0.1%
Other values (3789)  3789   99.7%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
3802   1      < 0.1%
3801   1      < 0.1%
3800   1      < 0.1%
3799   1      < 0.1%
3798   1      < 0.1%
3797   1      < 0.1%
3796   1      < 0.1%
3795   1      < 0.1%
3794   1      < 0.1%
3793   1      < 0.1%

property_type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 227.3 KiB

Value  Count
flat   2939
house  860

Length

Max length: 5
Median length: 4
Mean length: 4.226375362
Min length: 4

Characters and Unicode

Total characters: 16056
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
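
As an illustration of how such character statistics can be derived, the sketch below uses Python's standard `unicodedata` module on the two property_type values and their counts from this report. It is a minimal reconstruction under that assumption, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Category counts copied from the property_type summary.
value_counts = {"flat": 2939, "house": 860}

char_counts = Counter()
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # Each character occurs once per row holding this value.
        char_counts[ch] += count
        # unicodedata.category returns a general category code such as "Ll".
        category_counts[unicodedata.category(ch)] += count

total_characters = sum(char_counts.values())
print(total_characters)       # 16056
print(len(char_counts))       # 9 distinct characters
print(dict(category_counts))  # {'Ll': 16056}, Ll = Lowercase Letter
```

Dividing the total by the 3799 rows gives the reported mean length of about 4.23.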

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: flat
2nd row: flat
3rd row: flat
4th row: flat
5th row: flat

Common Values

Value  Count  Frequency (%)
flat   2939   77.4%
house  860    22.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
flat   2939   77.4%
house  860    22.6%

Most occurring characters

Value  Count  Frequency (%)
f      2939   18.3%
l      2939   18.3%
a      2939   18.3%
t      2939   18.3%
h      860    5.4%
o      860    5.4%
u      860    5.4%
s      860    5.4%
e      860    5.4%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  16056  100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
f      2939   18.3%
l      2939   18.3%
a      2939   18.3%
t      2939   18.3%
h      860    5.4%
o      860    5.4%
u      860    5.4%
s      860    5.4%
e      860    5.4%

Most occurring scripts

Value  Count  Frequency (%)
Latin  16056  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
f      2939   18.3%
l      2939   18.3%
a      2939   18.3%
t      2939   18.3%
h      860    5.4%
o      860    5.4%
u      860    5.4%
s      860    5.4%
e      860    5.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  16056  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
f      2939   18.3%
l      2939   18.3%
a      2939   18.3%
t      2939   18.3%
h      860    5.4%
o      860    5.4%
u      860    5.4%
s      860    5.4%
e      860    5.4%

society
Categorical

HIGH CARDINALITY

Distinct: 676
Distinct (%): 17.8%
Missing: 1
Missing (%): < 0.1%
Memory size: 274.3 KiB

Value                                 Count
independent                           486
tulip violet                          75
ss the leaf                           74
shapoorji pallonji joyville gurugram  44
dlf new town heights                  42
Other values (671)                    3077

Length

Max length: 49
Median length: 39
Mean length: 16.9181148
Min length: 1

Characters and Unicode

Total characters: 64255
Distinct characters: 41
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 292
Unique (%): 7.7%

Sample

1st row: signature global park 4
2nd row: smart world gems
3rd row: pyramid elite
4th row: breez global hill view
5th row: bestech park view sanskruti

Common Values

Value                                 Count  Frequency (%)
independent                           486    12.8%
tulip violet                          75     2.0%
ss the leaf                           74     1.9%
shapoorji pallonji joyville gurugram  44     1.2%
dlf new town heights                  42     1.1%
signature global park                 37     1.0%
shree vardhman victoria               35     0.9%
smart world orchard                   33     0.9%
emaar mgf emerald floors premier      32     0.8%
smart world gems                      32     0.8%
Other values (666)                    2908   76.5%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
independent         491    4.9%
the                 362    3.6%
dlf                 225    2.2%
park                219    2.2%
city                172    1.7%
global              165    1.6%
signature           161    1.6%
emaar               159    1.6%
m3m                 156    1.6%
heights             138    1.4%
Other values (783)  7767   77.6%
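
The table above counts individual words rather than whole society names. A minimal sketch of such a word count, assuming lowercasing and whitespace tokenisation (the report does not document its tokeniser, and `word_frequencies` is a name made up for this example), applied to the five sample rows shown earlier:

```python
from collections import Counter


def word_frequencies(values):
    """Count whitespace-separated, lowercased words across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


# The five sample rows listed for the society variable.
sample = [
    "signature global park 4",
    "smart world gems",
    "pyramid elite",
    "breez global hill view",
    "bestech park view sanskruti",
]

freq = word_frequencies(sample)
print(freq.most_common(3))
```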

Most occurring characters

Value              Count  Frequency (%)
e                  6932   10.8%
(space)            6219   9.7%
a                  6084   9.5%
r                  4348   6.8%
n                  4268   6.6%
i                  3965   6.2%
t                  3848   6.0%
s                  3621   5.6%
l                  3066   4.8%
o                  2861   4.5%
Other values (31)  19043  29.6%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   57470  89.4%
Space Separator    6219   9.7%
Decimal Number     548    0.9%
Other Punctuation  10     < 0.1%
Dash Punctuation   8      < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
e                  6932   12.1%
a                  6084   10.6%
r                  4348   7.6%
n                  4268   7.4%
i                  3965   6.9%
t                  3848   6.7%
s                  3621   6.3%
l                  3066   5.3%
o                  2861   5.0%
d                  2548   4.4%
Other values (16)  15929  27.7%

Decimal Number
Value  Count  Frequency (%)
3      216    39.4%
2      83     15.1%
1      76     13.9%
6      61     11.1%
8      34     6.2%
4      19     3.5%
5      17     3.1%
9      15     2.7%
7      14     2.6%
0      13     2.4%

Other Punctuation
Value  Count  Frequency (%)
,      7      70.0%
/      2      20.0%
.      1      10.0%

Space Separator
Value    Count  Frequency (%)
(space)  6219   100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   57470  89.4%
Common  6785   10.6%

Most frequent character per script

Latin
Value              Count  Frequency (%)
e                  6932   12.1%
a                  6084   10.6%
r                  4348   7.6%
n                  4268   7.4%
i                  3965   6.9%
t                  3848   6.7%
s                  3621   6.3%
l                  3066   5.3%
o                  2861   5.0%
d                  2548   4.4%
Other values (16)  15929  27.7%

Common
Value             Count  Frequency (%)
(space)           6219   91.7%
3                 216    3.2%
2                 83     1.2%
1                 76     1.1%
6                 61     0.9%
8                 34     0.5%
4                 19     0.3%
5                 17     0.3%
9                 15     0.2%
7                 14     0.2%
Other values (5)  31     0.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  64255  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
e                  6932   10.8%
(space)            6219   9.7%
a                  6084   9.5%
r                  4348   6.8%
n                  4268   6.6%
i                  3965   6.2%
t                  3848   6.0%
s                  3621   5.6%
l                  3066   4.8%
o                  2861   4.5%
Other values (31)  19043  29.6%

sector
Categorical

HIGH CARDINALITY

Distinct: 113
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 246.2 KiB

Value               Count
sohna road          163
sector 102          112
sector 85           110
sector 92           105
sector 69           94
Other values (108)  3215

Length

Max length: 26
Median length: 9
Mean length: 9.323506186
Min length: 7

Characters and Unicode

Total characters: 35420
Distinct characters: 31
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: sector 36
2nd row: sector 89
3rd row: sector 86
4th row: sohna road
5th row: sector 92

Common Values

Value               Count  Frequency (%)
sohna road          163    4.3%
sector 102          112    2.9%
sector 85           110    2.9%
sector 92           105    2.8%
sector 69           94     2.5%
sector 65           90     2.4%
sector 90           90     2.4%
sector 81           90     2.4%
sector 109          88     2.3%
sector 79           80     2.1%
Other values (103)  2777   73.1%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
sector              3565   46.7%
road                187    2.5%
sohna               175    2.3%
102                 112    1.5%
85                  110    1.4%
92                  105    1.4%
69                  94     1.2%
65                  90     1.2%
90                  90     1.2%
81                  90     1.2%
Other values (106)  3010   39.5%

Most occurring characters

Value              Count  Frequency (%)
o                  3938   11.1%
(space)            3829   10.8%
s                  3820   10.8%
r                  3819   10.8%
e                  3656   10.3%
c                  3618   10.2%
t                  3576   10.1%
1                  1104   3.1%
0                  825    2.3%
8                  806    2.3%
Other values (21)  6429   18.2%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  24082  68.0%
Decimal Number    7509   21.2%
Space Separator   3829   10.8%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
o                  3938   16.4%
s                  3820   15.9%
r                  3819   15.9%
e                  3656   15.2%
c                  3618   15.0%
t                  3576   14.8%
a                  730    3.0%
d                  263    1.1%
n                  230    1.0%
h                  213    0.9%
Other values (10)  219    0.9%

Decimal Number
Value  Count  Frequency (%)
1      1104   14.7%
0      825    11.0%
8      806    10.7%
9      803    10.7%
6      759    10.1%
7      707    9.4%
3      699    9.3%
2      698    9.3%
5      608    8.1%
4      500    6.7%

Space Separator
Value    Count  Frequency (%)
(space)  3829   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   24082  68.0%
Common  11338  32.0%

Most frequent character per script

Latin
Value              Count  Frequency (%)
o                  3938   16.4%
s                  3820   15.9%
r                  3819   15.9%
e                  3656   15.2%
c                  3618   15.0%
t                  3576   14.8%
a                  730    3.0%
d                  263    1.1%
n                  230    1.0%
h                  213    0.9%
Other values (10)  219    0.9%

Common
Value    Count  Frequency (%)
(space)  3829   33.8%
1        1104   9.7%
0        825    7.3%
8        806    7.1%
9        803    7.1%
6        759    6.7%
7        707    6.2%
3        699    6.2%
2        698    6.2%
5        608    5.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  35420  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
o                  3938   11.1%
(space)            3829   10.8%
s                  3820   10.8%
r                  3819   10.8%
e                  3656   10.3%
c                  3618   10.2%
t                  3576   10.1%
1                  1104   3.1%
0                  825    2.3%
8                  806    2.3%
Other values (21)  6429   18.2%

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 473
Distinct (%): 12.5%
Missing: 18
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.507476858
Minimum: 0.07
Maximum: 31.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 0.07
5-th percentile: 0.37
Q1: 0.94
Median: 1.5
Q3: 2.7
95-th percentile: 8.49
Maximum: 31.5
Range: 31.43
Interquartile range (IQR): 1.76

Descriptive statistics

Standard deviation: 2.951218775
Coefficient of variation (CV): 1.176967502
Kurtosis: 15.24237484
Mean: 2.507476858
Median Absolute Deviation (MAD): 0.71
Skewness: 3.309621269
Sum: 9480.77
Variance: 8.709692256
Monotonicity: Not monotonic
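
For price, the median absolute deviation (0.71) is far smaller than the standard deviation (about 2.95), which is what a long right tail produces: the MAD ignores how far the extreme values lie from the median. A minimal sketch of the MAD on a made-up, price-like list (the numbers below are illustrative, not rows from the dataset):

```python
from statistics import median, pstdev


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


# Illustrative values with one large outlier, loosely shaped like price.
toy_prices = [0.9, 1.2, 1.5, 2.7, 31.5]

mad = median_absolute_deviation(toy_prices)
spread = pstdev(toy_prices)
print(mad, spread)  # the MAD stays near 0.6 while the standard deviation is near 12
```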
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
1.25                83     2.2%
0.9                 68     1.8%
1.1                 66     1.7%
1.2                 66     1.7%
1.5                 66     1.7%
1.4                 63     1.7%
1.3                 59     1.6%
2                   56     1.5%
0.95                56     1.5%
1                   51     1.3%
Other values (463)  3147   82.8%

Value  Count  Frequency (%)
0.07   1      < 0.1%
0.16   1      < 0.1%
0.17   1      < 0.1%
0.19   1      < 0.1%
0.2    9      0.2%
0.21   6      0.2%
0.22   9      0.2%
0.23   1      < 0.1%
0.24   7      0.2%
0.25   11     0.3%

Value  Count  Frequency (%)
31.5   1      < 0.1%
27.5   1      < 0.1%
26     2      0.1%
25     1      < 0.1%
24     1      < 0.1%
23     1      < 0.1%
22     1      < 0.1%
20     3      0.1%
19.5   2      0.1%
19     3      0.1%

price_per_sqft
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2651
Distinct (%): 70.1%
Missing: 18
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 13806.58556
Minimum: 4
Maximum: 600000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 4
5-th percentile: 4723
Q1: 6808
Median: 9000
Q3: 13769
95-th percentile: 33333
Maximum: 600000
Range: 599996
Interquartile range (IQR): 6961

Descriptive statistics

Standard deviation: 23063.24874
Coefficient of variation (CV): 1.670452745
Kurtosis: 186.8598359
Mean: 13806.58556
Median Absolute Deviation (MAD): 2764
Skewness: 11.43378364
Sum: 52202700
Variance: 531913442.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
10000                28     0.7%
8000                 19     0.5%
12500                17     0.4%
5000                 17     0.4%
6666                 14     0.4%
11111                14     0.4%
7500                 14     0.4%
8333                 13     0.3%
22222                13     0.3%
6000                 11     0.3%
Other values (2641)  3621   95.3%
(Missing)            18     0.5%

Value  Count  Frequency (%)
4      1      < 0.1%
5      1      < 0.1%
7      1      < 0.1%
9      1      < 0.1%
53     1      < 0.1%
57     1      < 0.1%
58     2      0.1%
60     1      < 0.1%
61     1      < 0.1%
79     1      < 0.1%

Value   Count  Frequency (%)
600000  1      < 0.1%
400000  1      < 0.1%
315789  1      < 0.1%
308333  1      < 0.1%
290948  1      < 0.1%
283333  1      < 0.1%
266666  1      < 0.1%
261194  1      < 0.1%
245398  1      < 0.1%
241666  1      < 0.1%

area
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1312
Distinct (%): 34.7%
Missing: 18
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2847.559905
Minimum: 50
Maximum: 875000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 50
5-th percentile: 519
Q1: 1224
Median: 1725
Q3: 2295
95-th percentile: 4200
Maximum: 875000
Range: 874950
Interquartile range (IQR): 1071

Descriptive statistics

Standard deviation: 22795.33415
Coefficient of variation (CV): 8.005216715
Kurtosis: 973.1635807
Mean: 2847.559905
Median Absolute Deviation (MAD): 525
Skewness: 30.216827
Sum: 10766624
Variance: 519627258.8
Monotonicity: Not monotonic
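
A skewness of about 30 alongside a median of 1725 and a maximum of 875000 suggests that a handful of extreme values dominate the statistic. The sketch below illustrates the effect with the population skewness formula on made-up data (100 ordinary values plus one extreme value). The list contents are illustrative, and the report's own estimator may differ in small-sample corrections.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# 100 ordinary areas (illustrative), then the same list with one extreme value.
typical = [1224, 1500, 1725, 2000, 2295] * 20
with_outlier = typical + [875000]

print(skewness(typical))       # close to 0
print(skewness(with_outlier))  # close to 10
```

With a single dominant outlier among n values, the skewness approaches (n - 2) / sqrt(n - 1), so it grows with the sample size rather than with the size of the outlier.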
Histogram with fixed size bins (bins=50)

Value                Count  Frequency (%)
1650                 55     1.4%
1350                 51     1.3%
1800                 48     1.3%
1950                 44     1.2%
3240                 43     1.1%
2700                 39     1.0%
900                  39     1.0%
2000                 35     0.9%
2250                 25     0.7%
2400                 25     0.7%
Other values (1302)  3377   88.9%

Value  Count  Frequency (%)
50     4      0.1%
55     1      < 0.1%
56     1      < 0.1%
57     1      < 0.1%
60     2      0.1%
61     1      < 0.1%
67     2      0.1%
70     1      < 0.1%
72     1      < 0.1%
76     1      < 0.1%

Value   Count  Frequency (%)
875000  1      < 0.1%
642857  1      < 0.1%
620000  1      < 0.1%
566667  1      < 0.1%
215517  1      < 0.1%
98978   1      < 0.1%
82781   1      < 0.1%
65517   2      0.1%
65261   1      < 0.1%
58228   1      < 0.1%

areaWithType
Categorical

HIGH CARDINALITY

Distinct: 2355
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 411.3 KiB

Value                                  Count
Plot area 360(301.01 sq.m.)            37
Plot area 300(250.84 sq.m.)            26
Plot area 200(167.23 sq.m.)            19
Plot area 502(419.74 sq.m.)            19
Super Built up area 1578(146.6 sq.m.)  17
Other values (2350)                    3681

Length

Max length: 124
Median length: 119
Mean length: 53.84337984
Min length: 12

Characters and Unicode

Total characters: 204551
Distinct characters: 35
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1784
Unique (%): 47.0%

Sample

1st row: Super Built up area 1081(100.43 sq.m.)Carpet area: 650 sq.ft. (60.39 sq.m.)
2nd row: Carpet area: 1103 (102.47 sq.m.)
3rd row: Carpet area: 58141 (5401.48 sq.m.)
4th row: Built Up area: 1000 (92.9 sq.m.)Carpet area: 585 sq.ft. (54.35 sq.m.)
5th row: Super Built up area 1995(185.34 sq.m.)Built Up area: 1615 sq.ft. (150.04 sq.m.)Carpet area: 1476 sq.ft. (137.12 sq.m.)
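
Each areaWithType value packs up to three labelled measurements into one string, which helps explain its high cardinality. A minimal sketch of pulling the labelled numbers apart with a regular expression is shown below. The pattern and the `parse_area_with_type` helper are assumptions written against the sample rows and common values in this report, not the code used to build the dataset. The unit of each leading number is not always stated in the string, so the numbers are returned as-is.

```python
import re

# Labels as they appear in the sample rows; the colon after a label is optional.
AREA_PATTERN = re.compile(
    r"(Super Built up area|Built Up area|Carpet area|Plot area):?\s*([\d.]+)"
)


def parse_area_with_type(text):
    """Map each area label found in the text to its leading number."""
    return {label: float(number) for label, number in AREA_PATTERN.findall(text)}


row = (
    "Super Built up area 1995(185.34 sq.m.)"
    "Built Up area: 1615 sq.ft. (150.04 sq.m.)"
    "Carpet area: 1476 sq.ft. (137.12 sq.m.)"
)
print(parse_area_with_type(row))
# {'Super Built up area': 1995.0, 'Built Up area': 1615.0, 'Carpet area': 1476.0}

print(parse_area_with_type("Plot area 360(301.01 sq.m.)"))
# {'Plot area': 360.0}
```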

Common Values

Value                                                                          Count  Frequency (%)
Plot area 360(301.01 sq.m.)                                                    37     1.0%
Plot area 300(250.84 sq.m.)                                                    26     0.7%
Plot area 200(167.23 sq.m.)                                                    19     0.5%
Plot area 502(419.74 sq.m.)                                                    19     0.5%
Super Built up area 1578(146.6 sq.m.)                                          17     0.4%
Plot area 270(225.75 sq.m.)                                                    17     0.4%
Super Built up area 1950(181.16 sq.m.)Carpet area: 1161 sq.ft. (107.86 sq.m.)  17     0.4%
Super Built up area 1350(125.42 sq.m.)                                         17     0.4%
Super Built up area 1650(153.29 sq.m.)Carpet area: 1022.58 sq.ft. (95 sq.m.)   15     0.4%
Plot area 150(125.42 sq.m.)                                                    14     0.4%
Other values (2345)                                                            3601   94.8%

Length

Histogram of lengths of the category
Value                Count  Frequency (%)
area                 5722   18.5%
sq.m                 3775   12.2%
up                   3099   10.0%
built                2390   7.7%
super                1913   6.2%
sq.ft                1777   5.7%
sq.m.)carpet         1206   3.9%
carpet               731    2.4%
sq.m.)built          707    2.3%
plot                 682    2.2%
Other values (2846)  8955   28.9%

Most occurring characters

Value              Count  Frequency (%)
(space)            27158  13.3%
.                  20885  10.2%
a                  13521  6.6%
r                  9712   4.7%
e                  9576   4.7%
1                  9454   4.6%
s                  7739   3.8%
q                  7603   3.7%
t                  7499   3.7%
p                  6953   3.4%
Other values (25)  84451  41.3%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   84829  41.5%
Decimal Number     48371  23.6%
Space Separator    27158  13.3%
Other Punctuation  24012  11.7%
Uppercase Letter   8821   4.3%
Close Punctuation  5680   2.8%
Open Punctuation   5680   2.8%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
a                 13521  15.9%
r                 9712   11.4%
e                 9576   11.3%
s                 7739   9.1%
q                 7603   9.0%
t                 7499   8.8%
p                 6953   8.2%
u                 6925   8.2%
m                 5690   6.7%
l                 3781   4.5%
Other values (5)  5830   6.9%

Decimal Number
Value  Count  Frequency (%)
1      9454   19.5%
0      6787   14.0%
2      5846   12.1%
5      4848   10.0%
3      4068   8.4%
4      3809   7.9%
6      3765   7.8%
7      3336   6.9%
8      3233   6.7%
9      3225   6.7%

Uppercase Letter
Value  Count  Frequency (%)
B      3099   35.1%
C      1941   22.0%
S      1913   21.7%
U      1186   13.4%
P      682    7.7%

Other Punctuation
Value  Count  Frequency (%)
.      20885  87.0%
:      3127   13.0%

Space Separator
Value    Count  Frequency (%)
(space)  27158  100.0%

Close Punctuation
Value  Count  Frequency (%)
)      5680   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      5680   100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  110901  54.2%
Latin   93650   45.8%

Most frequent character per script

Latin
Value              Count  Frequency (%)
a                  13521  14.4%
r                  9712   10.4%
e                  9576   10.2%
s                  7739   8.3%
q                  7603   8.1%
t                  7499   8.0%
p                  6953   7.4%
u                  6925   7.4%
m                  5690   6.1%
l                  3781   4.0%
Other values (10)  14651  15.6%

Common
Value             Count  Frequency (%)
(space)           27158  24.5%
.                 20885  18.8%
1                 9454   8.5%
0                 6787   6.1%
2                 5846   5.3%
)                 5680   5.1%
(                 5680   5.1%
5                 4848   4.4%
3                 4068   3.7%
4                 3809   3.4%
Other values (5)  16686  15.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  204551  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            27158  13.3%
.                  20885  10.2%
a                  13521  6.6%
r                  9712   4.7%
e                  9576   4.7%
1                  9454   4.6%
s                  7739   3.8%
q                  7603   3.7%
t                  7499   3.7%
p                  6953   3.4%
Other values (25)  84451  41.3%

bedRoom
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.339036589
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 21
Range: 20
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.877313975
Coefficient of variation (CV): 0.5622322263
Kurtosis: 18.59948716
Mean: 3.339036589
Median Absolute Deviation (MAD): 1
Skewness: 3.51084346
Sum: 12685
Variance: 3.52430776
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Value             Count  Frequency (%)
3                 1545   40.7%
2                 990    26.1%
4                 675    17.8%
5                 213    5.6%
1                 130    3.4%
6                 75     2.0%
9                 41     1.1%
8                 30     0.8%
12                28     0.7%
7                 28     0.7%
Other values (9)  44     1.2%

Value  Count  Frequency (%)
1      130    3.4%
2      990    26.1%
3      1545   40.7%
4      675    17.8%
5      213    5.6%
6      75     2.0%
7      28     0.7%
8      30     0.8%
9      41     1.1%
10     20     0.5%

Value  Count  Frequency (%)
21     1      < 0.1%
20     1      < 0.1%
19     2      0.1%
18     2      0.1%
16     12     0.3%
14     1      < 0.1%
13     4      0.1%
12     28     0.7%
11     1      < 0.1%
10     20     0.5%
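
The bedRoom histogram uses fixed size bins, meaning the range from minimum to maximum is cut into equal-width intervals. A minimal sketch of that binning on made-up values follows; `fixed_width_histogram` is a name chosen for this example, it assumes the maximum is larger than the minimum, and the plotting library's own edge handling may differ in detail.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes hi > lo
    counts = [0] * bins
    for v in values:
        # The maximum would land one past the end, so clamp it into the last bin.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    return counts


# Illustrative bedroom counts with one extreme value, as in the long tail above.
toy_bedrooms = [1, 2, 2, 3, 3, 3, 4, 21]
print(fixed_width_histogram(toy_bedrooms, bins=4))  # [7, 0, 0, 1]
```

One extreme value stretches the range, so most observations collapse into the first bin, which is the pattern a long-tailed variable shows under equal-width binning.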

bathroom
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 19
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.405896288
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 21
Range: 20
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.930609559
Coefficient of variation (CV): 0.5668433199
Kurtosis: 17.75649285
Mean: 3.405896288
Median Absolute Deviation (MAD): 1
Skewness: 3.258740355
Sum: 12939
Variance: 3.727253271
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Value             Count  Frequency (%)
3                 1112   29.3%
2                 1102   29.0%
4                 839    22.1%
5                 299    7.9%
1                 160    4.2%
6                 119    3.1%
7                 41     1.1%
9                 41     1.1%
8                 26     0.7%
12                22     0.6%
Other values (9)  38     1.0%

Value  Count  Frequency (%)
1      160    4.2%
2      1102   29.0%
3      1112   29.3%
4      839    22.1%
5      299    7.9%
6      119    3.1%
7      41     1.1%
8      26     0.7%
9      41     1.1%
10     9      0.2%

Value  Count  Frequency (%)
21     1      < 0.1%
20     3      0.1%
18     4      0.1%
17     3      0.1%
16     8      0.2%
14     2      0.1%
13     4      0.1%
12     22     0.6%
11     4      0.1%
10     9      0.2%

balcony
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 216.5 KiB

Value  Count
3+     1202
3      1110
2      921
1      376
0      190

Length

Max length: 2
Median length: 1
Mean length: 1.316399052
Min length: 1

Characters and Unicode

Total characters: 5001
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 1
5th row: 3+

Common Values

Value  Count  Frequency (%)
3+     1202   31.6%
3      1110   29.2%
2      921    24.2%
1      376    9.9%
0      190    5.0%
5.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
3      2312   60.9%
2      921    24.2%
1      376    9.9%
0      190    5.0%

Most occurring characters

Value  Count  Frequency (%)
3      2312   46.2%
+      1202   24.0%
2      921    18.4%
1      376    7.5%
0      190    3.8%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   76.0%
Math Symbol     1202   24.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
3      2312   60.9%
2      921    24.2%
1      376    9.9%
0      190    5.0%

Math Symbol
Value  Count  Frequency (%)
+      1202   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  5001   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
3      2312   46.2%
+      1202   24.0%
2      921    18.4%
1      376    7.5%
0      190    3.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  5001   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
3      2312   46.2%
+      1202   24.0%
2      921    18.4%
1      376    7.5%
0      190    3.8%

floorNum
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 43
Distinct (%): 1.1%
Missing: 19
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.81031746
Minimum: 0
Maximum: 51
Zeros: 134
Zeros (%): 3.5%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 5
Q3: 10
95-th percentile: 18
Maximum: 51
Range: 51
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 6.027742988
Coefficient of variation (CV): 0.8850898689
Kurtosis: 4.555255785
Mean: 6.81031746
Median Absolute Deviation (MAD): 3
Skewness: 1.70027758
Sum: 25743
Variance: 36.33368553
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)

Value              Count  Frequency (%)
3                  513    13.5%
2                  506    13.3%
1                  363    9.6%
4                  328    8.6%
8                  197    5.2%
6                  187    4.9%
10                 186    4.9%
7                  183    4.8%
5                  177    4.7%
9                  170    4.5%
Other values (33)  970    25.5%

Value  Count  Frequency (%)
0      134    3.5%
1      363    9.6%
2      506    13.3%
3      513    13.5%
4      328    8.6%
5      177    4.7%
6      187    4.9%
7      183    4.8%
8      197    5.2%
9      170    4.5%

Value  Count  Frequency (%)
51     1      < 0.1%
45     1      < 0.1%
44     1      < 0.1%
43     2      0.1%
40     2      0.1%
39     2      0.1%
38     1      < 0.1%
35     2      0.1%
34     2      0.1%
33     4      0.1%

facing
Categorical

MISSING

Distinct: 8
Distinct (%): 0.3%
Missing: 1104
Missing (%): 29.1%
Memory size: 202.6 KiB

Value             Count
East              640
North-East        638
North             398
West              255
South             233
Other values (3)  531

Length

Max length: 10
Median length: 5
Mean length: 6.836734694
Min length: 4

Characters and Unicode

Total characters: 18425
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: North-West
2nd row: North-East
3rd row: North-East
4th row: East
5th row: North-East

Common Values

Value       Count  Frequency (%)
East        640    16.8%
North-East  638    16.8%
North       398    10.5%
West        255    6.7%
South       233    6.1%
North-West  200    5.3%
South-East  174    4.6%
South-West  157    4.1%
(Missing)   1104   29.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count  Frequency (%)
east        640    23.7%
north-east  638    23.7%
north       398    14.8%
west        255    9.5%
south       233    8.6%
north-west  200    7.4%
south-east  174    6.5%
south-west  157    5.8%
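
The same count of East appears as 16.8% under Common Values and as 23.7% in the category frequency table because the two use different denominators: all rows versus rows where facing is present. A short check using the counts from this section (variable names are illustrative):

```python
# Counts copied from the facing summary.
n_rows = 3799
n_missing = 1104
east = 640

# Rows where facing is actually recorded.
n_present = n_rows - n_missing

share_of_all_rows = 100 * east / n_rows
share_of_present = 100 * east / n_present

print(n_present)                    # 2695
print(round(share_of_all_rows, 1))  # 16.8
print(round(share_of_present, 1))   # 23.7
```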

Most occurring characters

Value             Count  Frequency (%)
t                 3864   21.0%
s                 2064   11.2%
o                 1800   9.8%
h                 1800   9.8%
E                 1452   7.9%
a                 1452   7.9%
N                 1236   6.7%
r                 1236   6.7%
-                 1169   6.3%
W                 612    3.3%
Other values (3)  1740   9.4%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  13392  72.7%
Uppercase Letter  3864   21.0%
Dash Punctuation  1169   6.3%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
t      3864   28.9%
s      2064   15.4%
o      1800   13.4%
h      1800   13.4%
a      1452   10.8%
r      1236   9.2%
e      612    4.6%
u      564    4.2%

Uppercase Letter
Value  Count  Frequency (%)
E      1452   37.6%
N      1236   32.0%
W      612    15.8%
S      564    14.6%

Dash Punctuation
Value  Count  Frequency (%)
-      1169   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   17256  93.7%
Common  1169   6.3%

Most frequent character per script

Latin
Value             Count  Frequency (%)
t                 3864   22.4%
s                 2064   12.0%
o                 1800   10.4%
h                 1800   10.4%
E                 1452   8.4%
a                 1452   8.4%
N                 1236   7.2%
r                 1236   7.2%
W                 612    3.5%
e                 612    3.5%
Other values (2)  1128   6.5%

Common
Value  Count  Frequency (%)
-      1169   100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  18425  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
t                 3864   21.0%
s                 2064   11.2%
o                 1800   9.8%
h                 1800   9.8%
E                 1452   7.9%
a                 1452   7.9%
N                 1236   6.7%
r                 1236   6.7%
-                 1169   6.3%
W                 612    3.3%
Other values (3)  1740   9.4%

agePossession
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 261.2 KiB

Value           Count
Relatively New  1676
New Property    625
Moderately Old  575
Undefined       332
Old Property    310

Length

Max length: 18
Median length: 14
Mean length: 13.36667544
Min length: 9

Characters and Unicode

Total characters: 50780
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: New Property
2nd row: New Property
3rd row: Under Construction
4th row: New Property
5th row: Relatively New

Common Values

Value               Count  Frequency (%)
Relatively New      1676   44.1%
New Property        625    16.5%
Moderately Old      575    15.1%
Undefined           332    8.7%
Old Property        310    8.2%
Under Construction  281    7.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value         Count  Frequency (%)
new           2301   31.7%
relatively    1676   23.1%
property      935    12.9%
old           885    12.2%
moderately    575    7.9%
undefined     332    4.6%
under         281    3.9%
construction  281    3.9%

Most occurring characters

Value              Count  Frequency (%)
e                  8683   17.1%
l                  4812   9.5%
t                  3748   7.4%
(space)            3467   6.8%
y                  3186   6.3%
r                  3007   5.9%
d                  2405   4.7%
N                  2301   4.5%
w                  2301   4.5%
i                  2289   4.5%
Other values (15)  14581  28.7%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  40047  78.9%
Uppercase Letter  7266   14.3%
Space Separator   3467   6.8%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
e                 8683   21.7%
l                 4812   12.0%
t                 3748   9.4%
y                 3186   8.0%
r                 3007   7.5%
d                 2405   6.0%
w                 2301   5.7%
i                 2289   5.7%
a                 2251   5.6%
o                 2072   5.2%
Other values (7)  5293   13.2%

Uppercase Letter
Value  Count  Frequency (%)
N      2301   31.7%
R      1676   23.1%
P      935    12.9%
O      885    12.2%
U      613    8.4%
M      575    7.9%
C      281    3.9%

Space Separator
Value    Count  Frequency (%)
(space)  3467   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   47313  93.2%
Common  3467   6.8%

Most frequent character per script

Latin
Value              Count  Frequency (%)
e                  8683   18.4%
l                  4812   10.2%
t                  3748   7.9%
y                  3186   6.7%
r                  3007   6.4%
d                  2405   5.1%
N                  2301   4.9%
w                  2301   4.9%
i                  2289   4.8%
a                  2251   4.8%
Other values (14)  12330  26.1%

Common
Value    Count  Frequency (%)
(space)  3467   100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  50780  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
e                  8683   17.1%
l                  4812   9.5%
t                  3748   7.4%
(space)            3467   6.8%
y                  3186   6.3%
r                  3007   5.9%
d                  2405   4.7%
N                  2301   4.5%
w                  2301   4.5%
i                  2289   4.5%
Other values (15)  14581  28.7%

super_built_up_area
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 593
Distinct (%): 31.0%
Missing: 1886
Missing (%): 49.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1921.653189
Minimum: 89
Maximum: 10000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 89
5-th percentile: 760.2
Q1: 1457
Median: 1828
Q3: 2215
95-th percentile: 3187.8
Maximum: 10000
Range: 9911
Interquartile range (IQR): 758

Descriptive statistics

Standard deviation: 767.0577436
Coefficient of variation (CV): 0.3991655456
Kurtosis: 10.10218731
Mean: 1921.653189
Median Absolute Deviation (MAD): 372
Skewness: 1.825863791
Sum: 3676122.55
Variance: 588377.582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
1650                38     1.0%
1950                38     1.0%
2000                26     0.7%
1578                25     0.7%
2150                23     0.6%
1640                22     0.6%
2408                20     0.5%
1350                19     0.5%
1900                19     0.5%
1930                18     0.5%
Other values (583)  1665   43.8%
(Missing)           1886   49.6%

Value  Count  Frequency (%)
89     1      < 0.1%
145    1      < 0.1%
161    1      < 0.1%
215    1      < 0.1%
216    1      < 0.1%
325    1      < 0.1%
340    1      < 0.1%
352    1      < 0.1%
380    1      < 0.1%
406    1      < 0.1%

Value  Count  Frequency (%)
10000  1      < 0.1%
6926   1      < 0.1%
6000   1      < 0.1%
5800   2      0.1%
5514   1      < 0.1%
5350   2      0.1%
5200   2      0.1%
4890   1      < 0.1%
4857   2      0.1%
4848   2      0.1%

built_up_area
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 644
Distinct (%): 37.2%
Missing: 2067
Missing (%): 54.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2361.075848
Minimum: 2
Maximum: 737147
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 245.95
Q1: 1100
Median: 1650
Q3: 2399.25
95-th percentile: 4663.5
Maximum: 737147
Range: 737145
Interquartile range (IQR): 1299.25

Descriptive statistics

Standard deviation: 17724.68689
Coefficient of variation (CV): 7.507038331
Kurtosis: 1709.127429
Mean: 2361.075848
Median Absolute Deviation (MAD): 641.5
Skewness: 41.20580346
Sum: 4089383.369
Variance: 314164525.5
Monotonicity: Not monotonic
2023-09-13T00:55:00.918088image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
180041
 
1.1%
324037
 
1.0%
135034
 
0.9%
190034
 
0.9%
270033
 
0.9%
90028
 
0.7%
160026
 
0.7%
200025
 
0.7%
130025
 
0.7%
170023
 
0.6%
Other values (634)1426
37.5%
(Missing)2067
54.4%
ValueCountFrequency (%)
21
 
< 0.1%
141
 
< 0.1%
301
 
< 0.1%
331
 
< 0.1%
503
0.1%
531
 
< 0.1%
551
 
< 0.1%
561
 
< 0.1%
571
 
< 0.1%
605
0.1%
ValueCountFrequency (%)
7371471
 
< 0.1%
135001
 
< 0.1%
112861
 
< 0.1%
95001
 
< 0.1%
90007
0.2%
87751
 
< 0.1%
82861
 
< 0.1%
8067.81
 
< 0.1%
80001
 
< 0.1%
75002
 
0.1%

carpet_area
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 733
Distinct (%): 37.8%
Missing: 1858
Missing (%): 48.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2486.290952
Minimum: 15
Maximum: 607936
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 15
5-th percentile: 348
Q1: 830
median: 1295
Q3: 1790
95-th percentile: 2950
Maximum: 607936
Range: 607921
Interquartile range (IQR): 960

Descriptive statistics

Standard deviation: 22392.41749
Coefficient of variation (CV): 9.006354412
Kurtosis: 626.8688104
Mean: 2486.290952
Median Absolute Deviation (MAD): 473
Skewness: 24.77697875
Sum: 4825890.738
Variance: 501420360.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
1400                42     1.1%
1800                36     0.9%
1600                36     0.9%
1200                32     0.8%
1500                30     0.8%
1350                28     0.7%
1650                28     0.7%
1450                23     0.6%
1300                23     0.6%
1000                22     0.6%
Other values (723)  1641   43.2%
(Missing)           1858   48.9%

Value  Count  Frequency (%)
15     1      < 0.1%
33     1      < 0.1%
48     1      < 0.1%
50     1      < 0.1%
59     1      < 0.1%
60     1      < 0.1%
66     1      < 0.1%
72     1      < 0.1%
76.44  3      0.1%
77.31  2      0.1%

Value   Count  Frequency (%)
607936  1      < 0.1%
569243  1      < 0.1%
514396  1      < 0.1%
64529   1      < 0.1%
64412   1      < 0.1%
58141   1      < 0.1%
54917   1      < 0.1%
48811   1      < 0.1%
45966   1      < 0.1%
34401   1      < 0.1%

study room
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      3079
1      720

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

Most occurring characters

Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      3079   81.0%
1      720    19.0%

servant room
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      2443
1      1356

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

Most occurring characters

Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      2443   64.3%
1      1356   35.7%

store room
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      3455
1      344

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

Most occurring characters

Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      3455   90.9%
1      344    9.1%

pooja room
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      3136
1      663

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

Most occurring characters

Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      3136   82.5%
1      663    17.5%

others
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      3379
1      420

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

Most occurring characters

Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      3379   88.9%
1      420    11.1%

furnishing_type
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 215.3 KiB

Value  Count
0      2505
1      1078
2      216

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3799
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

Most occurring characters

Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  3799   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  3799   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  3799   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      2505   65.9%
1      1078   28.4%
2      216    5.7%

luxury_score
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 161
Distinct (%): 4.4%
Missing: 147
Missing (%): 3.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 71.26286966
Minimum: 0
Maximum: 174
Zeros: 457
Zeros (%): 12.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 32
median: 59
Q3: 109.25
95-th percentile: 174
Maximum: 174
Range: 174
Interquartile range (IQR): 77.25

Descriptive statistics

Standard deviation: 52.73792625
Coefficient of variation (CV): 0.7400477485
Kurtosis: -0.8532173463
Mean: 71.26286966
Median Absolute Deviation (MAD): 37
Skewness: 0.4671366724
Sum: 260252
Variance: 2781.288865
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0                   457    12.0%
49                  342    9.0%
174                 192    5.1%
44                  60     1.6%
38                  58     1.5%
72                  56     1.5%
165                 53     1.4%
37                  48     1.3%
60                  48     1.3%
45                  44     1.2%
Other values (151)  2294   60.4%
(Missing)           147    3.9%

Value  Count  Frequency (%)
0      457    12.0%
5      6      0.2%
6      6      0.2%
7      41     1.1%
8      26     0.7%
9      9      0.2%
12     7      0.2%
13     10     0.3%
14     12     0.3%
15     42     1.1%

Value  Count  Frequency (%)
174    192    5.1%
169    1      < 0.1%
168    8      0.2%
167    20     0.5%
166    10     0.3%
165    53     1.4%
161    3      0.1%
160    27     0.7%
159    21     0.6%
158    31     0.8%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
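
The recipe above can be sketched in plain Python: rank each variable, then apply the covariance-over-standard-deviations formula to the ranks. This is an illustration only, not the report generator's own code; the helper names and the toy data are assumptions, and ties are given the average of their rank positions.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman(x, y)
r = pearson(x, y)
print(f"rho={rho:.4f} r={r:.4f}")
```

On this toy data the rank correlation is essentially 1 while Pearson's r is lower, which is the behaviour the description refers to.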

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
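
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above, might look as follows. The function name and the sample values are made up for illustration; positive scale factors are assumed.

```python
from statistics import mean

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/(n-1) factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Illustrative, made-up observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_rescaled = pearson_r(x_shifted, y_shifted)

print(f"r_original={r_original:.6f} r_rescaled={r_rescaled:.6f}")
```

The two printed values agree up to floating-point rounding, since shifting and positively rescaling either variable leaves r unchanged.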

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
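
The pair-counting definition can be written out directly. The sketch below implements that simple form (often called tau-a), which ignores tied pairs in the numerator; implementations that correct for ties can give different values on data with many ties. The function name and the toy data are assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

The double loop is quadratic in the number of observations, which is fine for a small illustration but slow for large columns.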

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
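
For orientation, the bias correction usually attributed to Bergsma (2013) can be sketched on a contingency table as below. This is an assumption about the form of the correction, not a copy of the report generator's code; the function name and the two toy tables are hypothetical, and every row and column total is assumed to be non-zero.

```python
def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Shrink the table dimensions by the same kind of correction.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return (phi2_corr / min(k_corr - 1, r_corr - 1)) ** 0.5

# Hypothetical 2x2 tables: no association and perfect association.
v_independent = cramers_v_corrected([[20, 20], [20, 20]])
v_perfect = cramers_v_corrected([[40, 0], [0, 40]])
print(v_independent, round(v_perfect, 6))
```

The uncorrected estimator would be the square root of phi2 / min(k - 1, r - 1); the correction mainly matters for small samples or tables with many categories.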

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
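
One way to read the nullity correlation described above is as the Pearson correlation between the presence indicators of two columns. The sketch below illustrates that idea on made-up columns; the function name and data are hypothetical, `None` stands in for a missing cell, and a column that is entirely present or entirely missing would cause a division by zero here.

```python
from statistics import mean

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 0/1 'value is present' indicators."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    return cov / (sa * sb)

# Made-up columns: col_b is filled exactly where col_a is missing.
col_a = [None, 10.0, None, 12.5, None, 9.0]
col_b = [3.0, None, 4.0, None, 5.5, None]
# Made-up column that is missing in the same rows as col_a.
col_c = [None, 1.0, None, 2.0, None, 3.0]

opposite = nullity_correlation(col_a, col_b)
together = nullity_correlation(col_a, col_c)
print(round(opposite, 6), round(together, 6))
```

A value near -1 means one column tends to be filled where the other is empty, and a value near +1 means the two tend to be present or missing together.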

Sample

First rows

row | df_index | property_type | society | sector | price | price_per_sqft | area | areaWithType | bedRoom | bathroom | balcony | floorNum | facing | agePossession | super_built_up_area | built_up_area | carpet_area | study room | servant room | store room | pooja room | others | furnishing_type | luxury_score
0 | 0 | flat | signature global park 4 | sector 36 | 0.82 | 7585.0 | 1081.0 | Super Built up area 1081(100.43 sq.m.)Carpet area: 650 sq.ft. (60.39 sq.m.) | 3 | 2 | 2 | 2.0 | NaN | New Property | 1081.0 | NaN | 650.0 | 0 | 0 | 0 | 0 | 0 | 0 | 8.0
1 | 1 | flat | smart world gems | sector 89 | 0.95 | 8600.0 | 1105.0 | Carpet area: 1103 (102.47 sq.m.) | 2 | 2 | 2 | 4.0 | NaN | New Property | NaN | NaN | 1103.0 | 1 | 1 | 0 | 0 | 0 | 0 | 38.0
2 | 2 | flat | pyramid elite | sector 86 | 0.46 | 79.0 | 58228.0 | Carpet area: 58141 (5401.48 sq.m.) | 2 | 2 | 1 | 0.0 | NaN | Under Construction | NaN | NaN | 58141.0 | 0 | 0 | 0 | 0 | 0 | 0 | 15.0
3 | 3 | flat | breez global hill view | sohna road | 0.32 | 5470.0 | 585.0 | Built Up area: 1000 (92.9 sq.m.)Carpet area: 585 sq.ft. (54.35 sq.m.) | 2 | 2 | 1 | 17.0 | NaN | New Property | NaN | 1000.00 | 585.0 | 0 | 0 | 0 | 0 | 0 | 0 | 49.0
4 | 4 | flat | bestech park view sanskruti | sector 92 | 1.60 | 8020.0 | 1995.0 | Super Built up area 1995(185.34 sq.m.)Built Up area: 1615 sq.ft. (150.04 sq.m.)Carpet area: 1476 sq.ft. (137.12 sq.m.) | 3 | 4 | 3+ | 10.0 | North-West | Relatively New | 1995.0 | 1615.00 | 1476.0 | 0 | 1 | 0 | 0 | 1 | 1 | 174.0
5 | 5 | flat | suncity avenue | sector 102 | 0.48 | 9022.0 | 532.0 | Super Built up area 632(58.71 sq.m.)Carpet area: 532 sq.ft. (49.42 sq.m.) | 2 | 2 | 1 | 5.0 | North-East | Relatively New | 632.0 | NaN | 532.0 | 0 | 0 | 1 | 0 | 0 | 0 | 159.0
6 | 6 | flat | paras quartier | gwal pahari | 7.50 | 14018.0 | 5350.0 | Super Built up area 5350(497.03 sq.m.) | 4 | 4 | 3+ | 20.0 | North-East | New Property | 5350.0 | NaN | NaN | 0 | 1 | 0 | 1 | 1 | 1 | 49.0
7 | 7 | flat | experion the heartsong | sector 108 | 2.00 | 8554.0 | 2338.0 | Super Built up area 2338(217.21 sq.m.) | 3 | 3 | 3+ | 14.0 | East | Relatively New | 2338.0 | NaN | NaN | 0 | 1 | 0 | 0 | 0 | 0 | 95.0
8 | 8 | flat | adani m2k oyster grande | sector 102 | 1.90 | 9105.0 | 2087.0 | Super Built up area 1889(175.49 sq.m.) | 3 | 4 | 3 | 8.0 | North-East | Relatively New | 1889.0 | NaN | NaN | 0 | 1 | 0 | 0 | 0 | 0 | 165.0
9 | 9 | house | independent | sector 105 | 1.20 | 10122.0 | 1186.0 | Plot area 1185.51(110.14 sq.m.) | 6 | 2 | 1 | 2.0 | North-West | Old Property | NaN | 1185.51 | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 9.0

Last rows

row | df_index | property_type | society | sector | price | price_per_sqft | area | areaWithType | bedRoom | bathroom | balcony | floorNum | facing | agePossession | super_built_up_area | built_up_area | carpet_area | study room | servant room | store room | pooja room | others | furnishing_type | luxury_score
3789 | 3793 | flat | gls arawali homes | sohna road | 0.27 | 4687.0 | 576.0 | Carpet area: 576 (53.51 sq.m.) | 2 | 2 | 2 | 1.0 | East | New Property | NaN | NaN | 576.0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
3790 | 3794 | house | independent | sector 27 | 8.00 | 26298.0 | 3042.0 | Plot area 338(282.61 sq.m.) | 9 | 9 | 3 | 4.0 | North-East | Relatively New | NaN | 3042.0 | NaN | 1 | 1 | 1 | 1 | 0 | 2 | NaN
3791 | 3795 | flat | eldeco accolade | sohna road | 0.87 | 5965.0 | 1459.0 | Super Built up area 1457(135.36 sq.m.)Carpet area: 849 sq.ft. (78.87 sq.m.) | 2 | 2 | 3+ | 10.0 | NaN | Relatively New | 1457.0 | NaN | 849.0 | 1 | 0 | 0 | 0 | 0 | 0 | NaN
3792 | 3796 | flat | paras dews | sector 106 | 0.92 | 6642.0 | 1385.0 | Super Built up area 1385(128.67 sq.m.)Built Up area: 940 sq.ft. (87.33 sq.m.)Carpet area: 845 sq.ft. (78.5 sq.m.) | 2 | 2 | 3+ | 2.0 | East | Relatively New | 1385.0 | 940.0 | 845.0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
3793 | 3797 | house | surendra homes dayaindependentd colony | sector 6 | 0.75 | 15625.0 | 480.0 | Built Up area: 480 (44.59 sq.m.) | 4 | 4 | 2 | 1.0 | NaN | Undefined | NaN | 480.0 | NaN | 0 | 0 | 0 | 0 | 0 | 0 | NaN
3794 | 3798 | flat | pivotal devaan | sector 84 | 0.37 | 6346.0 | 583.0 | Super Built up area 583(54.16 sq.m.)Carpet area: 483 sq.ft. (44.87 sq.m.) | 2 | 2 | 1 | 5.0 | North-West | Relatively New | 583.0 | NaN | 483.0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN
3795 | 3799 | house | international city by sobha phase 1 | sector 109 | 6.00 | 9634.0 | 6228.0 | Plot area 692(578.6 sq.m.) | 5 | 5 | 3+ | 2.0 | South-West | Relatively New | NaN | 6228.0 | NaN | 1 | 1 | 1 | 1 | 0 | 0 | NaN
3796 | 3800 | flat | ansal api celebrity suites | sector 2 | 0.60 | 8163.0 | 735.0 | Super Built up area 735(68.28 sq.m.) | 1 | 1 | 1 | 5.0 | North-East | Moderately Old | 735.0 | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 1 | NaN
3797 | 3801 | house | independent | sector 43 | 15.50 | 28233.0 | 5490.0 | Plot area 610(510.04 sq.m.) | 5 | 6 | 3 | 3.0 | East | Moderately Old | NaN | 5490.0 | NaN | 1 | 1 | 1 | 1 | 0 | 0 | NaN
3798 | 3802 | flat | m3m ikonic | sector 68 | 1.78 | 9128.0 | 1950.0 | Super Built up area 1950(181.16 sq.m.)Built Up area: 1845 sq.ft. (171.41 sq.m.)Carpet area: 1530 sq.ft. (142.14 sq.m.) | 3 | 3 | 3+ | 27.0 | South | Relatively New | 1950.0 | 1845.0 | 1530.0 | 0 | 0 | 0 | 0 | 0 | 1 | NaN